{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-block alert-info\" style=\"margin-top: 20px\">\n",
    "    <a href=\"https://cocl.us/corsera_da0101en_notebook_top\">\n",
    "         <img src=\"https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DA0101EN/Images/TopAd.png\" width=\"750\" align=\"center\">\n",
    "    </a>\n",
    "</div>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a href=\"https://www.bigdatauniversity.com\"><img src=\"https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DA0101EN/Images/CCLog.png\" width=300, align=\"center\"></a>\n",
    "\n",
    "<h1 align=center><font size = 5>Data Analysis with Python</font></h1>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h1>Introduction</h1>\n",
    "<h3>Welcome!</h3>\n",
    "\n",
    "<p>\n",
    "In this section, you will learn how to approach data acquisition in various ways, and obtain necessary insights from a dataset. By the end of this lab, you will successfully load the data into Jupyter Notebook, and gain some fundamental insights via Pandas Library.\n",
    "</p>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h2>Table of Contents</h2>\n",
    "\n",
    "<div class=\"alert alert-block alert-info\" style=\"margin-top: 20px\">\n",
    "<ol>\n",
    "    <li><a href=\"#data_acquisition\">Data Acquisition</a>\n",
    "    <li><a href=\"#basic_insight\">Basic Insight of Dataset</a></li>\n",
    "</ol>\n",
    "\n",
    "Estimated Time Needed: <strong>10 min</strong>\n",
    "</div>\n",
    "<hr>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h1 id=\"data_acquisition\">Data Acquisition</h1>\n",
    "<p>\n",
    "There are various formats for a dataset, .csv, .json, .xlsx  etc. The dataset can be stored in different places, on your local machine or sometimes online.<br>\n",
    "In this section, you will learn how to load a dataset into our Jupyter Notebook.<br>\n",
    "In our case, the Automobile Dataset is an online source, and it is in CSV (comma separated value) format. Let's use this dataset as an example to practice data reading.\n",
    "<ul>\n",
    "    <li>data source: <a href=\"https://archive.ics.uci.edu/ml/machine-learning-databases/autos/imports-85.data\" target=\"_blank\">https://archive.ics.uci.edu/ml/machine-learning-databases/autos/imports-85.data</a></li>\n",
    "    <li>data type: csv</li>\n",
    "</ul>\n",
    "The Pandas Library is a useful tool that enables us to read various datasets into a data frame; our Jupyter notebook platforms have a built-in <b>Pandas Library</b> so that all we need to do is import Pandas without installing.\n",
    "</p>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# import pandas library\n",
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h2>Read Data</h2>\n",
    "<p>\n",
    "We use <code>pandas.read_csv()</code> function to read the csv file. In the bracket, we put the file path along with a quotation mark, so that pandas will read the file into a data frame from that address. The file path can be either an URL or your local file address.<br>\n",
    "Because the data does not include headers, we can add an argument <code>headers = None</code>  inside the  <code>read_csv()</code> method, so that pandas will not automatically set the first row as a header.<br>\n",
    "You can also assign the dataset to any variable you create.\n",
    "</p>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This dataset was hosted on IBM Cloud object click <a href=\"https://cocl.us/DA101EN_object_storage\">HERE</a> for free storage."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "# Import pandas library\n",
    "import pandas as pd\n",
    "\n",
    "# Read the online file by the URL provides above, and assign it to variable \"df\"\n",
    "other_path = \"https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DA0101EN/auto.csv\"\n",
    "df = pd.read_csv(other_path, header=None)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After reading the dataset, we can use the <code>dataframe.head(n)</code> method to check the top n rows of the dataframe; where n is an integer. Contrary to <code>dataframe.head(n)</code>, <code>dataframe.tail(n)</code> will show you the bottom n rows of the dataframe.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The first 5 rows of the dataframe\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "      <th>3</th>\n",
       "      <th>4</th>\n",
       "      <th>5</th>\n",
       "      <th>6</th>\n",
       "      <th>7</th>\n",
       "      <th>8</th>\n",
       "      <th>9</th>\n",
       "      <th>...</th>\n",
       "      <th>16</th>\n",
       "      <th>17</th>\n",
       "      <th>18</th>\n",
       "      <th>19</th>\n",
       "      <th>20</th>\n",
       "      <th>21</th>\n",
       "      <th>22</th>\n",
       "      <th>23</th>\n",
       "      <th>24</th>\n",
       "      <th>25</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>?</td>\n",
       "      <td>alfa-romero</td>\n",
       "      <td>gas</td>\n",
       "      <td>std</td>\n",
       "      <td>two</td>\n",
       "      <td>convertible</td>\n",
       "      <td>rwd</td>\n",
       "      <td>front</td>\n",
       "      <td>88.6</td>\n",
       "      <td>...</td>\n",
       "      <td>130</td>\n",
       "      <td>mpfi</td>\n",
       "      <td>3.47</td>\n",
       "      <td>2.68</td>\n",
       "      <td>9.0</td>\n",
       "      <td>111</td>\n",
       "      <td>5000</td>\n",
       "      <td>21</td>\n",
       "      <td>27</td>\n",
       "      <td>13495</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>?</td>\n",
       "      <td>alfa-romero</td>\n",
       "      <td>gas</td>\n",
       "      <td>std</td>\n",
       "      <td>two</td>\n",
       "      <td>convertible</td>\n",
       "      <td>rwd</td>\n",
       "      <td>front</td>\n",
       "      <td>88.6</td>\n",
       "      <td>...</td>\n",
       "      <td>130</td>\n",
       "      <td>mpfi</td>\n",
       "      <td>3.47</td>\n",
       "      <td>2.68</td>\n",
       "      <td>9.0</td>\n",
       "      <td>111</td>\n",
       "      <td>5000</td>\n",
       "      <td>21</td>\n",
       "      <td>27</td>\n",
       "      <td>16500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>?</td>\n",
       "      <td>alfa-romero</td>\n",
       "      <td>gas</td>\n",
       "      <td>std</td>\n",
       "      <td>two</td>\n",
       "      <td>hatchback</td>\n",
       "      <td>rwd</td>\n",
       "      <td>front</td>\n",
       "      <td>94.5</td>\n",
       "      <td>...</td>\n",
       "      <td>152</td>\n",
       "      <td>mpfi</td>\n",
       "      <td>2.68</td>\n",
       "      <td>3.47</td>\n",
       "      <td>9.0</td>\n",
       "      <td>154</td>\n",
       "      <td>5000</td>\n",
       "      <td>19</td>\n",
       "      <td>26</td>\n",
       "      <td>16500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>2</td>\n",
       "      <td>164</td>\n",
       "      <td>audi</td>\n",
       "      <td>gas</td>\n",
       "      <td>std</td>\n",
       "      <td>four</td>\n",
       "      <td>sedan</td>\n",
       "      <td>fwd</td>\n",
       "      <td>front</td>\n",
       "      <td>99.8</td>\n",
       "      <td>...</td>\n",
       "      <td>109</td>\n",
       "      <td>mpfi</td>\n",
       "      <td>3.19</td>\n",
       "      <td>3.40</td>\n",
       "      <td>10.0</td>\n",
       "      <td>102</td>\n",
       "      <td>5500</td>\n",
       "      <td>24</td>\n",
       "      <td>30</td>\n",
       "      <td>13950</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>164</td>\n",
       "      <td>audi</td>\n",
       "      <td>gas</td>\n",
       "      <td>std</td>\n",
       "      <td>four</td>\n",
       "      <td>sedan</td>\n",
       "      <td>4wd</td>\n",
       "      <td>front</td>\n",
       "      <td>99.4</td>\n",
       "      <td>...</td>\n",
       "      <td>136</td>\n",
       "      <td>mpfi</td>\n",
       "      <td>3.19</td>\n",
       "      <td>3.40</td>\n",
       "      <td>8.0</td>\n",
       "      <td>115</td>\n",
       "      <td>5500</td>\n",
       "      <td>18</td>\n",
       "      <td>22</td>\n",
       "      <td>17450</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 26 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "   0    1            2    3    4     5            6    7      8     9   ...  \\\n",
       "0   3    ?  alfa-romero  gas  std   two  convertible  rwd  front  88.6  ...   \n",
       "1   3    ?  alfa-romero  gas  std   two  convertible  rwd  front  88.6  ...   \n",
       "2   1    ?  alfa-romero  gas  std   two    hatchback  rwd  front  94.5  ...   \n",
       "3   2  164         audi  gas  std  four        sedan  fwd  front  99.8  ...   \n",
       "4   2  164         audi  gas  std  four        sedan  4wd  front  99.4  ...   \n",
       "\n",
       "    16    17    18    19    20   21    22  23  24     25  \n",
       "0  130  mpfi  3.47  2.68   9.0  111  5000  21  27  13495  \n",
       "1  130  mpfi  3.47  2.68   9.0  111  5000  21  27  16500  \n",
       "2  152  mpfi  2.68  3.47   9.0  154  5000  19  26  16500  \n",
       "3  109  mpfi  3.19  3.40  10.0  102  5500  24  30  13950  \n",
       "4  136  mpfi  3.19  3.40   8.0  115  5500  18  22  17450  \n",
       "\n",
       "[5 rows x 26 columns]"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# show the first 5 rows using dataframe.head() method\n",
    "print(\"The first 5 rows of the dataframe\") \n",
    "df.head(5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-danger alertdanger\" style=\"margin-top: 20px\">\n",
    "<h1> Question #1: </h1>\n",
    "<b>check the bottom 10 rows of data frame \"df\".</b>\n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "      <th>3</th>\n",
       "      <th>4</th>\n",
       "      <th>5</th>\n",
       "      <th>6</th>\n",
       "      <th>7</th>\n",
       "      <th>8</th>\n",
       "      <th>9</th>\n",
       "      <th>...</th>\n",
       "      <th>16</th>\n",
       "      <th>17</th>\n",
       "      <th>18</th>\n",
       "      <th>19</th>\n",
       "      <th>20</th>\n",
       "      <th>21</th>\n",
       "      <th>22</th>\n",
       "      <th>23</th>\n",
       "      <th>24</th>\n",
       "      <th>25</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>195</td>\n",
       "      <td>-1</td>\n",
       "      <td>74</td>\n",
       "      <td>volvo</td>\n",
       "      <td>gas</td>\n",
       "      <td>std</td>\n",
       "      <td>four</td>\n",
       "      <td>wagon</td>\n",
       "      <td>rwd</td>\n",
       "      <td>front</td>\n",
       "      <td>104.3</td>\n",
       "      <td>...</td>\n",
       "      <td>141</td>\n",
       "      <td>mpfi</td>\n",
       "      <td>3.78</td>\n",
       "      <td>3.15</td>\n",
       "      <td>9.5</td>\n",
       "      <td>114</td>\n",
       "      <td>5400</td>\n",
       "      <td>23</td>\n",
       "      <td>28</td>\n",
       "      <td>13415</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>196</td>\n",
       "      <td>-2</td>\n",
       "      <td>103</td>\n",
       "      <td>volvo</td>\n",
       "      <td>gas</td>\n",
       "      <td>std</td>\n",
       "      <td>four</td>\n",
       "      <td>sedan</td>\n",
       "      <td>rwd</td>\n",
       "      <td>front</td>\n",
       "      <td>104.3</td>\n",
       "      <td>...</td>\n",
       "      <td>141</td>\n",
       "      <td>mpfi</td>\n",
       "      <td>3.78</td>\n",
       "      <td>3.15</td>\n",
       "      <td>9.5</td>\n",
       "      <td>114</td>\n",
       "      <td>5400</td>\n",
       "      <td>24</td>\n",
       "      <td>28</td>\n",
       "      <td>15985</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>197</td>\n",
       "      <td>-1</td>\n",
       "      <td>74</td>\n",
       "      <td>volvo</td>\n",
       "      <td>gas</td>\n",
       "      <td>std</td>\n",
       "      <td>four</td>\n",
       "      <td>wagon</td>\n",
       "      <td>rwd</td>\n",
       "      <td>front</td>\n",
       "      <td>104.3</td>\n",
       "      <td>...</td>\n",
       "      <td>141</td>\n",
       "      <td>mpfi</td>\n",
       "      <td>3.78</td>\n",
       "      <td>3.15</td>\n",
       "      <td>9.5</td>\n",
       "      <td>114</td>\n",
       "      <td>5400</td>\n",
       "      <td>24</td>\n",
       "      <td>28</td>\n",
       "      <td>16515</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>198</td>\n",
       "      <td>-2</td>\n",
       "      <td>103</td>\n",
       "      <td>volvo</td>\n",
       "      <td>gas</td>\n",
       "      <td>turbo</td>\n",
       "      <td>four</td>\n",
       "      <td>sedan</td>\n",
       "      <td>rwd</td>\n",
       "      <td>front</td>\n",
       "      <td>104.3</td>\n",
       "      <td>...</td>\n",
       "      <td>130</td>\n",
       "      <td>mpfi</td>\n",
       "      <td>3.62</td>\n",
       "      <td>3.15</td>\n",
       "      <td>7.5</td>\n",
       "      <td>162</td>\n",
       "      <td>5100</td>\n",
       "      <td>17</td>\n",
       "      <td>22</td>\n",
       "      <td>18420</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>199</td>\n",
       "      <td>-1</td>\n",
       "      <td>74</td>\n",
       "      <td>volvo</td>\n",
       "      <td>gas</td>\n",
       "      <td>turbo</td>\n",
       "      <td>four</td>\n",
       "      <td>wagon</td>\n",
       "      <td>rwd</td>\n",
       "      <td>front</td>\n",
       "      <td>104.3</td>\n",
       "      <td>...</td>\n",
       "      <td>130</td>\n",
       "      <td>mpfi</td>\n",
       "      <td>3.62</td>\n",
       "      <td>3.15</td>\n",
       "      <td>7.5</td>\n",
       "      <td>162</td>\n",
       "      <td>5100</td>\n",
       "      <td>17</td>\n",
       "      <td>22</td>\n",
       "      <td>18950</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>200</td>\n",
       "      <td>-1</td>\n",
       "      <td>95</td>\n",
       "      <td>volvo</td>\n",
       "      <td>gas</td>\n",
       "      <td>std</td>\n",
       "      <td>four</td>\n",
       "      <td>sedan</td>\n",
       "      <td>rwd</td>\n",
       "      <td>front</td>\n",
       "      <td>109.1</td>\n",
       "      <td>...</td>\n",
       "      <td>141</td>\n",
       "      <td>mpfi</td>\n",
       "      <td>3.78</td>\n",
       "      <td>3.15</td>\n",
       "      <td>9.5</td>\n",
       "      <td>114</td>\n",
       "      <td>5400</td>\n",
       "      <td>23</td>\n",
       "      <td>28</td>\n",
       "      <td>16845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>201</td>\n",
       "      <td>-1</td>\n",
       "      <td>95</td>\n",
       "      <td>volvo</td>\n",
       "      <td>gas</td>\n",
       "      <td>turbo</td>\n",
       "      <td>four</td>\n",
       "      <td>sedan</td>\n",
       "      <td>rwd</td>\n",
       "      <td>front</td>\n",
       "      <td>109.1</td>\n",
       "      <td>...</td>\n",
       "      <td>141</td>\n",
       "      <td>mpfi</td>\n",
       "      <td>3.78</td>\n",
       "      <td>3.15</td>\n",
       "      <td>8.7</td>\n",
       "      <td>160</td>\n",
       "      <td>5300</td>\n",
       "      <td>19</td>\n",
       "      <td>25</td>\n",
       "      <td>19045</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>202</td>\n",
       "      <td>-1</td>\n",
       "      <td>95</td>\n",
       "      <td>volvo</td>\n",
       "      <td>gas</td>\n",
       "      <td>std</td>\n",
       "      <td>four</td>\n",
       "      <td>sedan</td>\n",
       "      <td>rwd</td>\n",
       "      <td>front</td>\n",
       "      <td>109.1</td>\n",
       "      <td>...</td>\n",
       "      <td>173</td>\n",
       "      <td>mpfi</td>\n",
       "      <td>3.58</td>\n",
       "      <td>2.87</td>\n",
       "      <td>8.8</td>\n",
       "      <td>134</td>\n",
       "      <td>5500</td>\n",
       "      <td>18</td>\n",
       "      <td>23</td>\n",
       "      <td>21485</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>203</td>\n",
       "      <td>-1</td>\n",
       "      <td>95</td>\n",
       "      <td>volvo</td>\n",
       "      <td>diesel</td>\n",
       "      <td>turbo</td>\n",
       "      <td>four</td>\n",
       "      <td>sedan</td>\n",
       "      <td>rwd</td>\n",
       "      <td>front</td>\n",
       "      <td>109.1</td>\n",
       "      <td>...</td>\n",
       "      <td>145</td>\n",
       "      <td>idi</td>\n",
       "      <td>3.01</td>\n",
       "      <td>3.40</td>\n",
       "      <td>23.0</td>\n",
       "      <td>106</td>\n",
       "      <td>4800</td>\n",
       "      <td>26</td>\n",
       "      <td>27</td>\n",
       "      <td>22470</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>204</td>\n",
       "      <td>-1</td>\n",
       "      <td>95</td>\n",
       "      <td>volvo</td>\n",
       "      <td>gas</td>\n",
       "      <td>turbo</td>\n",
       "      <td>four</td>\n",
       "      <td>sedan</td>\n",
       "      <td>rwd</td>\n",
       "      <td>front</td>\n",
       "      <td>109.1</td>\n",
       "      <td>...</td>\n",
       "      <td>141</td>\n",
       "      <td>mpfi</td>\n",
       "      <td>3.78</td>\n",
       "      <td>3.15</td>\n",
       "      <td>9.5</td>\n",
       "      <td>114</td>\n",
       "      <td>5400</td>\n",
       "      <td>19</td>\n",
       "      <td>25</td>\n",
       "      <td>22625</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>10 rows × 26 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "     0    1      2       3      4     5      6    7      8      9   ...   16  \\\n",
       "195  -1   74  volvo     gas    std  four  wagon  rwd  front  104.3  ...  141   \n",
       "196  -2  103  volvo     gas    std  four  sedan  rwd  front  104.3  ...  141   \n",
       "197  -1   74  volvo     gas    std  four  wagon  rwd  front  104.3  ...  141   \n",
       "198  -2  103  volvo     gas  turbo  four  sedan  rwd  front  104.3  ...  130   \n",
       "199  -1   74  volvo     gas  turbo  four  wagon  rwd  front  104.3  ...  130   \n",
       "200  -1   95  volvo     gas    std  four  sedan  rwd  front  109.1  ...  141   \n",
       "201  -1   95  volvo     gas  turbo  four  sedan  rwd  front  109.1  ...  141   \n",
       "202  -1   95  volvo     gas    std  four  sedan  rwd  front  109.1  ...  173   \n",
       "203  -1   95  volvo  diesel  turbo  four  sedan  rwd  front  109.1  ...  145   \n",
       "204  -1   95  volvo     gas  turbo  four  sedan  rwd  front  109.1  ...  141   \n",
       "\n",
       "       17    18    19    20   21    22  23  24     25  \n",
       "195  mpfi  3.78  3.15   9.5  114  5400  23  28  13415  \n",
       "196  mpfi  3.78  3.15   9.5  114  5400  24  28  15985  \n",
       "197  mpfi  3.78  3.15   9.5  114  5400  24  28  16515  \n",
       "198  mpfi  3.62  3.15   7.5  162  5100  17  22  18420  \n",
       "199  mpfi  3.62  3.15   7.5  162  5100  17  22  18950  \n",
       "200  mpfi  3.78  3.15   9.5  114  5400  23  28  16845  \n",
       "201  mpfi  3.78  3.15   8.7  160  5300  19  25  19045  \n",
       "202  mpfi  3.58  2.87   8.8  134  5500  18  23  21485  \n",
       "203   idi  3.01  3.40  23.0  106  4800  26  27  22470  \n",
       "204  mpfi  3.78  3.15   9.5  114  5400  19  25  22625  \n",
       "\n",
       "[10 rows x 26 columns]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Write your code below and press Shift+Enter to execute \n",
    "df.tail(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-danger alertdanger\" style=\"margin-top: 20px\">\n",
    "<h1> Question #1 Answer: </h1>\n",
    "<b>Run the code below for the solution!</b>\n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "jupyter": {
     "source_hidden": true
    }
   },
   "source": [
    "Double-click <b>here</b> for the solution.\n",
    "\n",
    "<!-- The answer is below:\n",
    "\n",
    "print(\"The last 10 rows of the dataframe\\n\")\n",
    "df.tail(10)\n",
    "\n",
    "-->"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h3>Add Headers</h3>\n",
    "<p>\n",
    "Take a look at our dataset; pandas automatically set the header by an integer from 0.\n",
    "</p>\n",
    "<p>\n",
    "To better describe our data we can introduce a header, this information is available at:  <a href=\"https://archive.ics.uci.edu/ml/datasets/Automobile\" target=\"_blank\">https://archive.ics.uci.edu/ml/datasets/Automobile</a>\n",
    "</p>\n",
    "<p>\n",
    "Thus, we have to add headers manually.\n",
    "</p>\n",
    "<p>\n",
    "Firstly, we create a list \"headers\" that include all column names in order.\n",
    "Then, we use <code>dataframe.columns = headers</code> to replace the headers by the list we created.\n",
    "</p>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "headers\n",
      " ['symboling', 'normalized-losses', 'make', 'fuel-type', 'aspiration', 'num-of-doors', 'body-style', 'drive-wheels', 'engine-location', 'wheel-base', 'length', 'width', 'height', 'curb-weight', 'engine-type', 'num-of-cylinders', 'engine-size', 'fuel-system', 'bore', 'stroke', 'compression-ratio', 'horsepower', 'peak-rpm', 'city-mpg', 'highway-mpg', 'price']\n"
     ]
    }
   ],
   "source": [
    "# create headers list\n",
    "headers = [\"symboling\",\"normalized-losses\",\"make\",\"fuel-type\",\"aspiration\", \"num-of-doors\",\"body-style\",\n",
    "         \"drive-wheels\",\"engine-location\",\"wheel-base\", \"length\",\"width\",\"height\",\"curb-weight\",\"engine-type\",\n",
    "         \"num-of-cylinders\", \"engine-size\",\"fuel-system\",\"bore\",\"stroke\",\"compression-ratio\",\"horsepower\",\n",
    "         \"peak-rpm\",\"city-mpg\",\"highway-mpg\",\"price\"]\n",
    "print(\"headers\\n\", headers)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " We replace headers and recheck our data frame"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>symboling</th>\n",
       "      <th>normalized-losses</th>\n",
       "      <th>make</th>\n",
       "      <th>fuel-type</th>\n",
       "      <th>aspiration</th>\n",
       "      <th>num-of-doors</th>\n",
       "      <th>body-style</th>\n",
       "      <th>drive-wheels</th>\n",
       "      <th>engine-location</th>\n",
       "      <th>wheel-base</th>\n",
       "      <th>...</th>\n",
       "      <th>engine-size</th>\n",
       "      <th>fuel-system</th>\n",
       "      <th>bore</th>\n",
       "      <th>stroke</th>\n",
       "      <th>compression-ratio</th>\n",
       "      <th>horsepower</th>\n",
       "      <th>peak-rpm</th>\n",
       "      <th>city-mpg</th>\n",
       "      <th>highway-mpg</th>\n",
       "      <th>price</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>?</td>\n",
       "      <td>alfa-romero</td>\n",
       "      <td>gas</td>\n",
       "      <td>std</td>\n",
       "      <td>two</td>\n",
       "      <td>convertible</td>\n",
       "      <td>rwd</td>\n",
       "      <td>front</td>\n",
       "      <td>88.6</td>\n",
       "      <td>...</td>\n",
       "      <td>130</td>\n",
       "      <td>mpfi</td>\n",
       "      <td>3.47</td>\n",
       "      <td>2.68</td>\n",
       "      <td>9.0</td>\n",
       "      <td>111</td>\n",
       "      <td>5000</td>\n",
       "      <td>21</td>\n",
       "      <td>27</td>\n",
       "      <td>13495</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>?</td>\n",
       "      <td>alfa-romero</td>\n",
       "      <td>gas</td>\n",
       "      <td>std</td>\n",
       "      <td>two</td>\n",
       "      <td>convertible</td>\n",
       "      <td>rwd</td>\n",
       "      <td>front</td>\n",
       "      <td>88.6</td>\n",
       "      <td>...</td>\n",
       "      <td>130</td>\n",
       "      <td>mpfi</td>\n",
       "      <td>3.47</td>\n",
       "      <td>2.68</td>\n",
       "      <td>9.0</td>\n",
       "      <td>111</td>\n",
       "      <td>5000</td>\n",
       "      <td>21</td>\n",
       "      <td>27</td>\n",
       "      <td>16500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>?</td>\n",
       "      <td>alfa-romero</td>\n",
       "      <td>gas</td>\n",
       "      <td>std</td>\n",
       "      <td>two</td>\n",
       "      <td>hatchback</td>\n",
       "      <td>rwd</td>\n",
       "      <td>front</td>\n",
       "      <td>94.5</td>\n",
       "      <td>...</td>\n",
       "      <td>152</td>\n",
       "      <td>mpfi</td>\n",
       "      <td>2.68</td>\n",
       "      <td>3.47</td>\n",
       "      <td>9.0</td>\n",
       "      <td>154</td>\n",
       "      <td>5000</td>\n",
       "      <td>19</td>\n",
       "      <td>26</td>\n",
       "      <td>16500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>2</td>\n",
       "      <td>164</td>\n",
       "      <td>audi</td>\n",
       "      <td>gas</td>\n",
       "      <td>std</td>\n",
       "      <td>four</td>\n",
       "      <td>sedan</td>\n",
       "      <td>fwd</td>\n",
       "      <td>front</td>\n",
       "      <td>99.8</td>\n",
       "      <td>...</td>\n",
       "      <td>109</td>\n",
       "      <td>mpfi</td>\n",
       "      <td>3.19</td>\n",
       "      <td>3.40</td>\n",
       "      <td>10.0</td>\n",
       "      <td>102</td>\n",
       "      <td>5500</td>\n",
       "      <td>24</td>\n",
       "      <td>30</td>\n",
       "      <td>13950</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>164</td>\n",
       "      <td>audi</td>\n",
       "      <td>gas</td>\n",
       "      <td>std</td>\n",
       "      <td>four</td>\n",
       "      <td>sedan</td>\n",
       "      <td>4wd</td>\n",
       "      <td>front</td>\n",
       "      <td>99.4</td>\n",
       "      <td>...</td>\n",
       "      <td>136</td>\n",
       "      <td>mpfi</td>\n",
       "      <td>3.19</td>\n",
       "      <td>3.40</td>\n",
       "      <td>8.0</td>\n",
       "      <td>115</td>\n",
       "      <td>5500</td>\n",
       "      <td>18</td>\n",
       "      <td>22</td>\n",
       "      <td>17450</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5</td>\n",
       "      <td>2</td>\n",
       "      <td>?</td>\n",
       "      <td>audi</td>\n",
       "      <td>gas</td>\n",
       "      <td>std</td>\n",
       "      <td>two</td>\n",
       "      <td>sedan</td>\n",
       "      <td>fwd</td>\n",
       "      <td>front</td>\n",
       "      <td>99.8</td>\n",
       "      <td>...</td>\n",
       "      <td>136</td>\n",
       "      <td>mpfi</td>\n",
       "      <td>3.19</td>\n",
       "      <td>3.40</td>\n",
       "      <td>8.5</td>\n",
       "      <td>110</td>\n",
       "      <td>5500</td>\n",
       "      <td>19</td>\n",
       "      <td>25</td>\n",
       "      <td>15250</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>6</td>\n",
       "      <td>1</td>\n",
       "      <td>158</td>\n",
       "      <td>audi</td>\n",
       "      <td>gas</td>\n",
       "      <td>std</td>\n",
       "      <td>four</td>\n",
       "      <td>sedan</td>\n",
       "      <td>fwd</td>\n",
       "      <td>front</td>\n",
       "      <td>105.8</td>\n",
       "      <td>...</td>\n",
       "      <td>136</td>\n",
       "      <td>mpfi</td>\n",
       "      <td>3.19</td>\n",
       "      <td>3.40</td>\n",
       "      <td>8.5</td>\n",
       "      <td>110</td>\n",
       "      <td>5500</td>\n",
       "      <td>19</td>\n",
       "      <td>25</td>\n",
       "      <td>17710</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>7</td>\n",
       "      <td>1</td>\n",
       "      <td>?</td>\n",
       "      <td>audi</td>\n",
       "      <td>gas</td>\n",
       "      <td>std</td>\n",
       "      <td>four</td>\n",
       "      <td>wagon</td>\n",
       "      <td>fwd</td>\n",
       "      <td>front</td>\n",
       "      <td>105.8</td>\n",
       "      <td>...</td>\n",
       "      <td>136</td>\n",
       "      <td>mpfi</td>\n",
       "      <td>3.19</td>\n",
       "      <td>3.40</td>\n",
       "      <td>8.5</td>\n",
       "      <td>110</td>\n",
       "      <td>5500</td>\n",
       "      <td>19</td>\n",
       "      <td>25</td>\n",
       "      <td>18920</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>8</td>\n",
       "      <td>1</td>\n",
       "      <td>158</td>\n",
       "      <td>audi</td>\n",
       "      <td>gas</td>\n",
       "      <td>turbo</td>\n",
       "      <td>four</td>\n",
       "      <td>sedan</td>\n",
       "      <td>fwd</td>\n",
       "      <td>front</td>\n",
       "      <td>105.8</td>\n",
       "      <td>...</td>\n",
       "      <td>131</td>\n",
       "      <td>mpfi</td>\n",
       "      <td>3.13</td>\n",
       "      <td>3.40</td>\n",
       "      <td>8.3</td>\n",
       "      <td>140</td>\n",
       "      <td>5500</td>\n",
       "      <td>17</td>\n",
       "      <td>20</td>\n",
       "      <td>23875</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>9</td>\n",
       "      <td>0</td>\n",
       "      <td>?</td>\n",
       "      <td>audi</td>\n",
       "      <td>gas</td>\n",
       "      <td>turbo</td>\n",
       "      <td>two</td>\n",
       "      <td>hatchback</td>\n",
       "      <td>4wd</td>\n",
       "      <td>front</td>\n",
       "      <td>99.5</td>\n",
       "      <td>...</td>\n",
       "      <td>131</td>\n",
       "      <td>mpfi</td>\n",
       "      <td>3.13</td>\n",
       "      <td>3.40</td>\n",
       "      <td>7.0</td>\n",
       "      <td>160</td>\n",
       "      <td>5500</td>\n",
       "      <td>16</td>\n",
       "      <td>22</td>\n",
       "      <td>?</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>10 rows × 26 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "   symboling normalized-losses         make fuel-type aspiration num-of-doors  \\\n",
       "0          3                 ?  alfa-romero       gas        std          two   \n",
       "1          3                 ?  alfa-romero       gas        std          two   \n",
       "2          1                 ?  alfa-romero       gas        std          two   \n",
       "3          2               164         audi       gas        std         four   \n",
       "4          2               164         audi       gas        std         four   \n",
       "5          2                 ?         audi       gas        std          two   \n",
       "6          1               158         audi       gas        std         four   \n",
       "7          1                 ?         audi       gas        std         four   \n",
       "8          1               158         audi       gas      turbo         four   \n",
       "9          0                 ?         audi       gas      turbo          two   \n",
       "\n",
       "    body-style drive-wheels engine-location  wheel-base  ...  engine-size  \\\n",
       "0  convertible          rwd           front        88.6  ...          130   \n",
       "1  convertible          rwd           front        88.6  ...          130   \n",
       "2    hatchback          rwd           front        94.5  ...          152   \n",
       "3        sedan          fwd           front        99.8  ...          109   \n",
       "4        sedan          4wd           front        99.4  ...          136   \n",
       "5        sedan          fwd           front        99.8  ...          136   \n",
       "6        sedan          fwd           front       105.8  ...          136   \n",
       "7        wagon          fwd           front       105.8  ...          136   \n",
       "8        sedan          fwd           front       105.8  ...          131   \n",
       "9    hatchback          4wd           front        99.5  ...          131   \n",
       "\n",
       "   fuel-system  bore  stroke compression-ratio horsepower  peak-rpm city-mpg  \\\n",
       "0         mpfi  3.47    2.68               9.0        111      5000       21   \n",
       "1         mpfi  3.47    2.68               9.0        111      5000       21   \n",
       "2         mpfi  2.68    3.47               9.0        154      5000       19   \n",
       "3         mpfi  3.19    3.40              10.0        102      5500       24   \n",
       "4         mpfi  3.19    3.40               8.0        115      5500       18   \n",
       "5         mpfi  3.19    3.40               8.5        110      5500       19   \n",
       "6         mpfi  3.19    3.40               8.5        110      5500       19   \n",
       "7         mpfi  3.19    3.40               8.5        110      5500       19   \n",
       "8         mpfi  3.13    3.40               8.3        140      5500       17   \n",
       "9         mpfi  3.13    3.40               7.0        160      5500       16   \n",
       "\n",
       "  highway-mpg  price  \n",
       "0          27  13495  \n",
       "1          27  16500  \n",
       "2          26  16500  \n",
       "3          30  13950  \n",
       "4          22  17450  \n",
       "5          25  15250  \n",
       "6          25  17710  \n",
       "7          25  18920  \n",
       "8          20  23875  \n",
       "9          22      ?  \n",
       "\n",
       "[10 rows x 26 columns]"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.columns = headers\n",
    "df.head(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "we can drop missing values along the column \"price\" as follows  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "df.dropna(subset=[\"price\"], axis=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, we have successfully read the raw dataset and add the correct headers into the data frame."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " <div class=\"alert alert-danger alertdanger\" style=\"margin-top: 20px\">\n",
    "<h1> Question #2: </h1>\n",
    "<b>Find the name of the columns of the dataframe</b>\n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['symboling', 'normalized-losses', 'make', 'fuel-type', 'aspiration',\n",
       "       'num-of-doors', 'body-style', 'drive-wheels', 'engine-location',\n",
       "       'wheel-base', 'length', 'width', 'height', 'curb-weight', 'engine-type',\n",
       "       'num-of-cylinders', 'engine-size', 'fuel-system', 'bore', 'stroke',\n",
       "       'compression-ratio', 'horsepower', 'peak-rpm', 'city-mpg',\n",
       "       'highway-mpg', 'price'],\n",
       "      dtype='object')"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Write your code below and press Shift+Enter to execute \n",
    "df.columns"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "jupyter": {
     "source_hidden": true
    }
   },
   "source": [
    "Double-click <b>here</b> for the solution.\n",
    "\n",
    "<!-- The answer is below:\n",
    "\n",
    "print(df.columns)\n",
    "\n",
    "-->"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h2>Save Dataset</h2>\n",
    "<p>\n",
    "Correspondingly, Pandas enables us to save the dataset to csv  by using the <code>dataframe.to_csv()</code> method, you can add the file path and name along with quotation marks in the brackets.\n",
    "</p>\n",
    "<p>\n",
    "    For example, if you would save the dataframe <b>df</b> as <b>automobile.csv</b> to your local machine, you may use the syntax below:\n",
    "</p>"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {},
   "source": [
    "df.to_csv(\"automobile.csv\", index=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " We can also read and save other file formats, we can use similar functions to **`pd.read_csv()`** and **`df.to_csv()`** for other data formats, the functions are listed in the following table:\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h2>Read/Save Other Data Formats</h2>\n",
    "\n",
    "\n",
    "\n",
    "| Data Formate  | Read           | Save             |\n",
    "| ------------- |:--------------:| ----------------:|\n",
    "| csv           | `pd.read_csv()`  |`df.to_csv()`     |\n",
    "| json          | `pd.read_json()` |`df.to_json()`    |\n",
    "| excel         | `pd.read_excel()`|`df.to_excel()`   |\n",
    "| hdf           | `pd.read_hdf()`  |`df.to_hdf()`     |\n",
    "| sql           | `pd.read_sql()`  |`df.to_sql()`     |\n",
    "| ...           |   ...          |       ...        |"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h1 id=\"basic_insight\">Basic Insight of Dataset</h1>\n",
    "<p>\n",
    "After reading data into Pandas dataframe, it is time for us to explore the dataset.<br>\n",
    "There are several ways to obtain essential insights of the data to help us better understand our dataset.\n",
    "</p>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h2>Data Types</h2>\n",
    "<p>\n",
    "Data has a variety of types.<br>\n",
    "The main types stored in Pandas dataframes are <b>object</b>, <b>float</b>, <b>int</b>, <b>bool</b> and <b>datetime64</b>. In order to better learn about each attribute, it is always good for us to know the data type of each column. In Pandas:\n",
    "</p>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "symboling              int64\n",
       "normalized-losses     object\n",
       "make                  object\n",
       "fuel-type             object\n",
       "aspiration            object\n",
       "num-of-doors          object\n",
       "body-style            object\n",
       "drive-wheels          object\n",
       "engine-location       object\n",
       "wheel-base           float64\n",
       "length               float64\n",
       "width                float64\n",
       "height               float64\n",
       "curb-weight            int64\n",
       "engine-type           object\n",
       "num-of-cylinders      object\n",
       "engine-size            int64\n",
       "fuel-system           object\n",
       "bore                  object\n",
       "stroke                object\n",
       "compression-ratio    float64\n",
       "horsepower            object\n",
       "peak-rpm              object\n",
       "city-mpg               int64\n",
       "highway-mpg            int64\n",
       "price                 object\n",
       "dtype: object"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.dtypes\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "returns a Series with the data type of each column."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "symboling              int64\n",
      "normalized-losses     object\n",
      "make                  object\n",
      "fuel-type             object\n",
      "aspiration            object\n",
      "num-of-doors          object\n",
      "body-style            object\n",
      "drive-wheels          object\n",
      "engine-location       object\n",
      "wheel-base           float64\n",
      "length               float64\n",
      "width                float64\n",
      "height               float64\n",
      "curb-weight            int64\n",
      "engine-type           object\n",
      "num-of-cylinders      object\n",
      "engine-size            int64\n",
      "fuel-system           object\n",
      "bore                  object\n",
      "stroke                object\n",
      "compression-ratio    float64\n",
      "horsepower            object\n",
      "peak-rpm              object\n",
      "city-mpg               int64\n",
      "highway-mpg            int64\n",
      "price                 object\n",
      "dtype: object\n"
     ]
    }
   ],
   "source": [
    "# check the data type of data frame \"df\" by .dtypes\n",
    "print(df.dtypes)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p>\n",
    "As a result, as shown above, it is clear to see that the data type of \"symboling\" and \"curb-weight\" are <code>int64</code>, \"normalized-losses\" is <code>object</code>, and \"wheel-base\" is <code>float64</code>, etc.\n",
    "</p>\n",
    "<p>\n",
    "These data types can be changed; we will learn how to accomplish this in a later module.\n",
    "</p>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h2>Describe</h2>\n",
    "If we would like to get a statistical summary of each column, such as count, column mean value, column standard deviation, etc. We use the describe method:"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {},
   "source": [
    "dataframe.describe()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This method will provide various summary statistics, excluding <code>NaN</code> (Not a Number) values."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>symboling</th>\n",
       "      <th>wheel-base</th>\n",
       "      <th>length</th>\n",
       "      <th>width</th>\n",
       "      <th>height</th>\n",
       "      <th>curb-weight</th>\n",
       "      <th>engine-size</th>\n",
       "      <th>compression-ratio</th>\n",
       "      <th>city-mpg</th>\n",
       "      <th>highway-mpg</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>count</td>\n",
       "      <td>205.000000</td>\n",
       "      <td>205.000000</td>\n",
       "      <td>205.000000</td>\n",
       "      <td>205.000000</td>\n",
       "      <td>205.000000</td>\n",
       "      <td>205.000000</td>\n",
       "      <td>205.000000</td>\n",
       "      <td>205.000000</td>\n",
       "      <td>205.000000</td>\n",
       "      <td>205.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>mean</td>\n",
       "      <td>0.834146</td>\n",
       "      <td>98.756585</td>\n",
       "      <td>174.049268</td>\n",
       "      <td>65.907805</td>\n",
       "      <td>53.724878</td>\n",
       "      <td>2555.565854</td>\n",
       "      <td>126.907317</td>\n",
       "      <td>10.142537</td>\n",
       "      <td>25.219512</td>\n",
       "      <td>30.751220</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>std</td>\n",
       "      <td>1.245307</td>\n",
       "      <td>6.021776</td>\n",
       "      <td>12.337289</td>\n",
       "      <td>2.145204</td>\n",
       "      <td>2.443522</td>\n",
       "      <td>520.680204</td>\n",
       "      <td>41.642693</td>\n",
       "      <td>3.972040</td>\n",
       "      <td>6.542142</td>\n",
       "      <td>6.886443</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>min</td>\n",
       "      <td>-2.000000</td>\n",
       "      <td>86.600000</td>\n",
       "      <td>141.100000</td>\n",
       "      <td>60.300000</td>\n",
       "      <td>47.800000</td>\n",
       "      <td>1488.000000</td>\n",
       "      <td>61.000000</td>\n",
       "      <td>7.000000</td>\n",
       "      <td>13.000000</td>\n",
       "      <td>16.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>25%</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>94.500000</td>\n",
       "      <td>166.300000</td>\n",
       "      <td>64.100000</td>\n",
       "      <td>52.000000</td>\n",
       "      <td>2145.000000</td>\n",
       "      <td>97.000000</td>\n",
       "      <td>8.600000</td>\n",
       "      <td>19.000000</td>\n",
       "      <td>25.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>50%</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>97.000000</td>\n",
       "      <td>173.200000</td>\n",
       "      <td>65.500000</td>\n",
       "      <td>54.100000</td>\n",
       "      <td>2414.000000</td>\n",
       "      <td>120.000000</td>\n",
       "      <td>9.000000</td>\n",
       "      <td>24.000000</td>\n",
       "      <td>30.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>75%</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>102.400000</td>\n",
       "      <td>183.100000</td>\n",
       "      <td>66.900000</td>\n",
       "      <td>55.500000</td>\n",
       "      <td>2935.000000</td>\n",
       "      <td>141.000000</td>\n",
       "      <td>9.400000</td>\n",
       "      <td>30.000000</td>\n",
       "      <td>34.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>max</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>120.900000</td>\n",
       "      <td>208.100000</td>\n",
       "      <td>72.300000</td>\n",
       "      <td>59.800000</td>\n",
       "      <td>4066.000000</td>\n",
       "      <td>326.000000</td>\n",
       "      <td>23.000000</td>\n",
       "      <td>49.000000</td>\n",
       "      <td>54.000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        symboling  wheel-base      length       width      height  \\\n",
       "count  205.000000  205.000000  205.000000  205.000000  205.000000   \n",
       "mean     0.834146   98.756585  174.049268   65.907805   53.724878   \n",
       "std      1.245307    6.021776   12.337289    2.145204    2.443522   \n",
       "min     -2.000000   86.600000  141.100000   60.300000   47.800000   \n",
       "25%      0.000000   94.500000  166.300000   64.100000   52.000000   \n",
       "50%      1.000000   97.000000  173.200000   65.500000   54.100000   \n",
       "75%      2.000000  102.400000  183.100000   66.900000   55.500000   \n",
       "max      3.000000  120.900000  208.100000   72.300000   59.800000   \n",
       "\n",
       "       curb-weight  engine-size  compression-ratio    city-mpg  highway-mpg  \n",
       "count   205.000000   205.000000         205.000000  205.000000   205.000000  \n",
       "mean   2555.565854   126.907317          10.142537   25.219512    30.751220  \n",
       "std     520.680204    41.642693           3.972040    6.542142     6.886443  \n",
       "min    1488.000000    61.000000           7.000000   13.000000    16.000000  \n",
       "25%    2145.000000    97.000000           8.600000   19.000000    25.000000  \n",
       "50%    2414.000000   120.000000           9.000000   24.000000    30.000000  \n",
       "75%    2935.000000   141.000000           9.400000   30.000000    34.000000  \n",
       "max    4066.000000   326.000000          23.000000   49.000000    54.000000  "
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.describe()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p>\n",
    "This shows the statistical summary of all numeric-typed (int, float) columns.<br>\n",
    "For example, the attribute \"symboling\" has 205 counts, the mean value of this column is 0.83, the standard deviation is 1.25, the minimum value is -2, 25th percentile is 0, 50th percentile is 1, 75th percentile is 2, and the maximum value is 3.\n",
    "<br>\n",
    "However, what if we would also like to check all the columns including those that are of type object.\n",
    "<br><br>\n",
    "\n",
    "You can add an argument <code>include = \"all\"</code> inside the bracket. Let's try it again.\n",
    "</p>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>symboling</th>\n",
       "      <th>normalized-losses</th>\n",
       "      <th>make</th>\n",
       "      <th>fuel-type</th>\n",
       "      <th>aspiration</th>\n",
       "      <th>num-of-doors</th>\n",
       "      <th>body-style</th>\n",
       "      <th>drive-wheels</th>\n",
       "      <th>engine-location</th>\n",
       "      <th>wheel-base</th>\n",
       "      <th>...</th>\n",
       "      <th>engine-size</th>\n",
       "      <th>fuel-system</th>\n",
       "      <th>bore</th>\n",
       "      <th>stroke</th>\n",
       "      <th>compression-ratio</th>\n",
       "      <th>horsepower</th>\n",
       "      <th>peak-rpm</th>\n",
       "      <th>city-mpg</th>\n",
       "      <th>highway-mpg</th>\n",
       "      <th>price</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>count</td>\n",
       "      <td>205.000000</td>\n",
       "      <td>205</td>\n",
       "      <td>205</td>\n",
       "      <td>205</td>\n",
       "      <td>205</td>\n",
       "      <td>205</td>\n",
       "      <td>205</td>\n",
       "      <td>205</td>\n",
       "      <td>205</td>\n",
       "      <td>205.000000</td>\n",
       "      <td>...</td>\n",
       "      <td>205.000000</td>\n",
       "      <td>205</td>\n",
       "      <td>205</td>\n",
       "      <td>205</td>\n",
       "      <td>205.000000</td>\n",
       "      <td>205</td>\n",
       "      <td>205</td>\n",
       "      <td>205.000000</td>\n",
       "      <td>205.000000</td>\n",
       "      <td>205</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>unique</td>\n",
       "      <td>NaN</td>\n",
       "      <td>52</td>\n",
       "      <td>22</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "      <td>5</td>\n",
       "      <td>3</td>\n",
       "      <td>2</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>8</td>\n",
       "      <td>39</td>\n",
       "      <td>37</td>\n",
       "      <td>NaN</td>\n",
       "      <td>60</td>\n",
       "      <td>24</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>187</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>top</td>\n",
       "      <td>NaN</td>\n",
       "      <td>?</td>\n",
       "      <td>toyota</td>\n",
       "      <td>gas</td>\n",
       "      <td>std</td>\n",
       "      <td>four</td>\n",
       "      <td>sedan</td>\n",
       "      <td>fwd</td>\n",
       "      <td>front</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>mpfi</td>\n",
       "      <td>3.62</td>\n",
       "      <td>3.40</td>\n",
       "      <td>NaN</td>\n",
       "      <td>68</td>\n",
       "      <td>5500</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>?</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>freq</td>\n",
       "      <td>NaN</td>\n",
       "      <td>41</td>\n",
       "      <td>32</td>\n",
       "      <td>185</td>\n",
       "      <td>168</td>\n",
       "      <td>114</td>\n",
       "      <td>96</td>\n",
       "      <td>120</td>\n",
       "      <td>202</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>94</td>\n",
       "      <td>23</td>\n",
       "      <td>20</td>\n",
       "      <td>NaN</td>\n",
       "      <td>19</td>\n",
       "      <td>37</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>mean</td>\n",
       "      <td>0.834146</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>98.756585</td>\n",
       "      <td>...</td>\n",
       "      <td>126.907317</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>10.142537</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>25.219512</td>\n",
       "      <td>30.751220</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>std</td>\n",
       "      <td>1.245307</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>6.021776</td>\n",
       "      <td>...</td>\n",
       "      <td>41.642693</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>3.972040</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>6.542142</td>\n",
       "      <td>6.886443</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>min</td>\n",
       "      <td>-2.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>86.600000</td>\n",
       "      <td>...</td>\n",
       "      <td>61.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>7.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>13.000000</td>\n",
       "      <td>16.000000</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>25%</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>94.500000</td>\n",
       "      <td>...</td>\n",
       "      <td>97.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>8.600000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>19.000000</td>\n",
       "      <td>25.000000</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>50%</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>97.000000</td>\n",
       "      <td>...</td>\n",
       "      <td>120.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>9.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>24.000000</td>\n",
       "      <td>30.000000</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>75%</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>102.400000</td>\n",
       "      <td>...</td>\n",
       "      <td>141.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>9.400000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>30.000000</td>\n",
       "      <td>34.000000</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>max</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>120.900000</td>\n",
       "      <td>...</td>\n",
       "      <td>326.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>23.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>49.000000</td>\n",
       "      <td>54.000000</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>11 rows × 26 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "         symboling normalized-losses    make fuel-type aspiration  \\\n",
       "count   205.000000               205     205       205        205   \n",
       "unique         NaN                52      22         2          2   \n",
       "top            NaN                 ?  toyota       gas        std   \n",
       "freq           NaN                41      32       185        168   \n",
       "mean      0.834146               NaN     NaN       NaN        NaN   \n",
       "std       1.245307               NaN     NaN       NaN        NaN   \n",
       "min      -2.000000               NaN     NaN       NaN        NaN   \n",
       "25%       0.000000               NaN     NaN       NaN        NaN   \n",
       "50%       1.000000               NaN     NaN       NaN        NaN   \n",
       "75%       2.000000               NaN     NaN       NaN        NaN   \n",
       "max       3.000000               NaN     NaN       NaN        NaN   \n",
       "\n",
       "       num-of-doors body-style drive-wheels engine-location  wheel-base  ...  \\\n",
       "count           205        205          205             205  205.000000  ...   \n",
       "unique            3          5            3               2         NaN  ...   \n",
       "top            four      sedan          fwd           front         NaN  ...   \n",
       "freq            114         96          120             202         NaN  ...   \n",
       "mean            NaN        NaN          NaN             NaN   98.756585  ...   \n",
       "std             NaN        NaN          NaN             NaN    6.021776  ...   \n",
       "min             NaN        NaN          NaN             NaN   86.600000  ...   \n",
       "25%             NaN        NaN          NaN             NaN   94.500000  ...   \n",
       "50%             NaN        NaN          NaN             NaN   97.000000  ...   \n",
       "75%             NaN        NaN          NaN             NaN  102.400000  ...   \n",
       "max             NaN        NaN          NaN             NaN  120.900000  ...   \n",
       "\n",
       "        engine-size  fuel-system  bore  stroke compression-ratio horsepower  \\\n",
       "count    205.000000          205   205     205        205.000000        205   \n",
       "unique          NaN            8    39      37               NaN         60   \n",
       "top             NaN         mpfi  3.62    3.40               NaN         68   \n",
       "freq            NaN           94    23      20               NaN         19   \n",
       "mean     126.907317          NaN   NaN     NaN         10.142537        NaN   \n",
       "std       41.642693          NaN   NaN     NaN          3.972040        NaN   \n",
       "min       61.000000          NaN   NaN     NaN          7.000000        NaN   \n",
       "25%       97.000000          NaN   NaN     NaN          8.600000        NaN   \n",
       "50%      120.000000          NaN   NaN     NaN          9.000000        NaN   \n",
       "75%      141.000000          NaN   NaN     NaN          9.400000        NaN   \n",
       "max      326.000000          NaN   NaN     NaN         23.000000        NaN   \n",
       "\n",
       "        peak-rpm    city-mpg highway-mpg price  \n",
       "count        205  205.000000  205.000000   205  \n",
       "unique        24         NaN         NaN   187  \n",
       "top         5500         NaN         NaN     ?  \n",
       "freq          37         NaN         NaN     4  \n",
       "mean         NaN   25.219512   30.751220   NaN  \n",
       "std          NaN    6.542142    6.886443   NaN  \n",
       "min          NaN   13.000000   16.000000   NaN  \n",
       "25%          NaN   19.000000   25.000000   NaN  \n",
       "50%          NaN   24.000000   30.000000   NaN  \n",
       "75%          NaN   30.000000   34.000000   NaN  \n",
       "max          NaN   49.000000   54.000000   NaN  \n",
       "\n",
       "[11 rows x 26 columns]"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# describe all the columns in \"df\" \n",
    "df.describe(include = \"all\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p>\n",
    "Now, it provides the statistical summary of all the columns, including object-typed attributes.<br>\n",
    "We can now see how many unique values, which is the top value and the frequency of top value in the object-typed columns.<br>\n",
    "Some values in the table above show as \"NaN\", this is because those numbers are not available regarding a particular column type.<br>\n",
    "</p>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-danger alertdanger\" style=\"margin-top: 20px\">\n",
    "<h1> Question #3: </h1>\n",
    "\n",
    "<p>\n",
    "You can select the columns of a data frame by indicating the name of  each column, for example, you can select the three columns as follows:\n",
    "</p>\n",
    "<p>\n",
    "    <code>dataframe[[' column 1 ',column 2', 'column 3']]</code>\n",
    "</p>\n",
    "<p>\n",
    "Where \"column\" is the name of the column, you can apply the method  \".describe()\" to get the statistics of those columns as follows:\n",
    "</p>\n",
    "<p>\n",
    "    <code>dataframe[[' column 1 ',column 2', 'column 3'] ].describe()</code>\n",
    "</p>\n",
    "\n",
    "Apply the  method to \".describe()\" to the columns 'length' and 'compression-ratio'.\n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>length</th>\n",
       "      <th>compression-ratio</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>count</td>\n",
       "      <td>205.000000</td>\n",
       "      <td>205.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>mean</td>\n",
       "      <td>174.049268</td>\n",
       "      <td>10.142537</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>std</td>\n",
       "      <td>12.337289</td>\n",
       "      <td>3.972040</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>min</td>\n",
       "      <td>141.100000</td>\n",
       "      <td>7.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>25%</td>\n",
       "      <td>166.300000</td>\n",
       "      <td>8.600000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>50%</td>\n",
       "      <td>173.200000</td>\n",
       "      <td>9.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>75%</td>\n",
       "      <td>183.100000</td>\n",
       "      <td>9.400000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>max</td>\n",
       "      <td>208.100000</td>\n",
       "      <td>23.000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "           length  compression-ratio\n",
       "count  205.000000         205.000000\n",
       "mean   174.049268          10.142537\n",
       "std     12.337289           3.972040\n",
       "min    141.100000           7.000000\n",
       "25%    166.300000           8.600000\n",
       "50%    173.200000           9.000000\n",
       "75%    183.100000           9.400000\n",
       "max    208.100000          23.000000"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Write your code below and press Shift+Enter to execute \n",
    "df[['length', 'compression-ratio']].describe()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "jupyter": {
     "source_hidden": true
    }
   },
   "source": [
    "Double-click <b>here</b> for the solution.\n",
    "\n",
    "<!-- The answer is below:\n",
    "\n",
    "df[['length', 'compression-ratio']].describe()\n",
    "\n",
    "-->\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h2>Info</h2>\n",
    "Another method you can use to check your dataset is:"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {},
   "source": [
    "dataframe.info"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It provide a concise summary of your DataFrame."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<bound method DataFrame.info of      symboling normalized-losses         make fuel-type aspiration  \\\n",
       "0            3                 ?  alfa-romero       gas        std   \n",
       "1            3                 ?  alfa-romero       gas        std   \n",
       "2            1                 ?  alfa-romero       gas        std   \n",
       "3            2               164         audi       gas        std   \n",
       "4            2               164         audi       gas        std   \n",
       "..         ...               ...          ...       ...        ...   \n",
       "200         -1                95        volvo       gas        std   \n",
       "201         -1                95        volvo       gas      turbo   \n",
       "202         -1                95        volvo       gas        std   \n",
       "203         -1                95        volvo    diesel      turbo   \n",
       "204         -1                95        volvo       gas      turbo   \n",
       "\n",
       "    num-of-doors   body-style drive-wheels engine-location  wheel-base  ...  \\\n",
       "0            two  convertible          rwd           front        88.6  ...   \n",
       "1            two  convertible          rwd           front        88.6  ...   \n",
       "2            two    hatchback          rwd           front        94.5  ...   \n",
       "3           four        sedan          fwd           front        99.8  ...   \n",
       "4           four        sedan          4wd           front        99.4  ...   \n",
       "..           ...          ...          ...             ...         ...  ...   \n",
       "200         four        sedan          rwd           front       109.1  ...   \n",
       "201         four        sedan          rwd           front       109.1  ...   \n",
       "202         four        sedan          rwd           front       109.1  ...   \n",
       "203         four        sedan          rwd           front       109.1  ...   \n",
       "204         four        sedan          rwd           front       109.1  ...   \n",
       "\n",
       "     engine-size  fuel-system  bore  stroke compression-ratio horsepower  \\\n",
       "0            130         mpfi  3.47    2.68               9.0        111   \n",
       "1            130         mpfi  3.47    2.68               9.0        111   \n",
       "2            152         mpfi  2.68    3.47               9.0        154   \n",
       "3            109         mpfi  3.19    3.40              10.0        102   \n",
       "4            136         mpfi  3.19    3.40               8.0        115   \n",
       "..           ...          ...   ...     ...               ...        ...   \n",
       "200          141         mpfi  3.78    3.15               9.5        114   \n",
       "201          141         mpfi  3.78    3.15               8.7        160   \n",
       "202          173         mpfi  3.58    2.87               8.8        134   \n",
       "203          145          idi  3.01    3.40              23.0        106   \n",
       "204          141         mpfi  3.78    3.15               9.5        114   \n",
       "\n",
       "     peak-rpm city-mpg highway-mpg  price  \n",
       "0        5000       21          27  13495  \n",
       "1        5000       21          27  16500  \n",
       "2        5000       19          26  16500  \n",
       "3        5500       24          30  13950  \n",
       "4        5500       18          22  17450  \n",
       "..        ...      ...         ...    ...  \n",
       "200      5400       23          28  16845  \n",
       "201      5300       19          25  19045  \n",
       "202      5500       18          23  21485  \n",
       "203      4800       26          27  22470  \n",
       "204      5400       19          25  22625  \n",
       "\n",
       "[205 rows x 26 columns]>"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# look at the info of \"df\"\n",
    "df.info"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p>\n",
    "Here we are able to see the information of our dataframe, with the top 30 rows and the bottom 30 rows.\n",
    "<br><br>\n",
    "And, it also shows us the whole data frame has 205 rows and 26 columns in total.\n",
    "</p>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h1>Excellent! You have just completed the  Introduction  Notebook!</h1>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-block alert-info\" style=\"margin-top: 20px\">\n",
    "\n",
    "    <p><a href=\"https://cocl.us/corsera_da0101en_notebook_bottom\"><img src=\"https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DA0101EN/Images/BottomAd.png\" width=\"750\" align=\"center\"></a></p>\n",
    "</div>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h3>About the Authors:</h3>\n",
    "\n",
    "This notebook was written by <a href=\"https://www.linkedin.com/in/mahdi-noorian-58219234/\" target=\"_blank\">Mahdi Noorian PhD</a>, <a href=\"https://www.linkedin.com/in/joseph-s-50398b136/\" target=\"_blank\">Joseph Santarcangelo</a>, Bahare Talayian, Eric Xiao, Steven Dong, Parizad, Hima Vsudevan and <a href=\"https://www.linkedin.com/in/fiorellawever/\" target=\"_blank\">Fiorella Wenver</a> and <a href=\" https://www.linkedin.com/in/yi-leng-yao-84451275/ \" target=\"_blank\" >Yi Yao</a>.\n",
    "\n",
    "<p><a href=\"https://www.linkedin.com/in/joseph-s-50398b136/\" target=\"_blank\">Joseph Santarcangelo</a> is a Data Scientist at IBM, and holds a PhD in Electrical Engineering. His research focused on using Machine Learning, Signal Processing, and Computer Vision to determine how videos impact human cognition. Joseph has been working for IBM since he completed his PhD.</p>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<hr>\n",
    "<p>Copyright &copy; 2018 IBM Developer Skills Network. This notebook and its source code are released under the terms of the <a href=\"https://cognitiveclass.ai/mit-license/\">MIT License</a>.</p>"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python",
   "language": "python",
   "name": "conda-env-python-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
